Finite-Time Analysis of Round-Robin Kullback-Leibler Upper Confidence Bounds for Optimal Adaptive Allocation with Multiple Plays and Markovian Rewards
We study an extension of the classic stochastic multi-armed bandit problem involving multiple plays and Markovian rewards in the rested-bandits setting. To tackle this problem we consider an adaptive allocation rule which, at each stage, combines the information from the sample means of all the arms with the Kullback-Leibler upper confidence bound of a single arm selected in a round-robin fashion. For rewards generated from a one-parameter exponential family of Markov chains, we provide a finite-time upper bound on the regret incurred by this adaptive allocation rule, which reveals the logarithmic dependence of the regret on the time horizon and is asymptotically optimal. For our analysis we devise several concentration results for Markov chains, including a maximal inequality for Markov chains, that may be of interest in their own right. As a byproduct of our analysis we also establish asymptotically optimal, finite-time guarantees for the case of multiple plays and i.i.d. rewards.
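The selection rule described in the abstract can be sketched for the simplest case. This is a minimal, hypothetical sketch assuming Bernoulli rewards and a plain log(t) exploration rate; the paper's indices for exponential families of Markov chains are more general. At each stage, every arm's index is its sample mean, except one arm (chosen round-robin) whose index is its KL-UCB; the `num_plays` arms with the largest indices are played:

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for safety."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb(mean, n, t, precision=1e-6):
    """KL upper confidence bound: the largest q with n * kl(mean, q) <= log(t),
    found by binary search (kl(mean, .) is increasing on [mean, 1])."""
    if n == 0:
        return 1.0
    target = math.log(max(t, 2)) / n
    lo, hi = mean, 1.0
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo

def round_robin_klucb_step(means, counts, t, num_plays):
    """One stage of the rule: all arms use their sample means as indices,
    except the single round-robin arm, which uses its KL-UCB index."""
    k = len(means)
    rr_arm = t % k  # the arm examined in round-robin order at this stage
    indices = list(means)
    indices[rr_arm] = kl_ucb(means[rr_arm], counts[rr_arm], t)
    # play the num_plays arms with the largest indices
    return sorted(range(k), key=lambda a: indices[a], reverse=True)[:num_plays]
```

Inflating only one index per stage keeps the per-stage cost of computing confidence bounds constant in the number of arms, while the round-robin schedule still guarantees every arm is periodically explored.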
Reviews: Learning Multiple Markov Chains via Adaptive Allocation
This paper aims at learning a collection of transition matrices of ergodic Markov chains, where at each round the algorithm selects one of the chains and observes the state it moves to. The problem consists in designing a strategy such that learning occurs uniformly over all chains at the best possible rate. The paper is theoretical in nature; the background on Markov chains is properly introduced, and the algorithm is clearly described and thoroughly analyzed. The paper in its current form is a stronger submission than its previous version: it is more focused, the assumptions are clearer, it is more detailed, and it is an overall better read.
All reviewers felt that this is a well-executed paper with good writing and solid results, therefore clearly worthy of acceptance. The only general complaint was that the setting may have been somewhat poorly motivated, and the authors should consider providing an illustrative motivating example in the final version of the paper.
Learning Multiple Markov Chains via Adaptive Allocation
We study the problem of learning the transition matrices of a set of Markov chains from a single stream of observations on each chain. We assume that the Markov chains are ergodic but otherwise unknown. The learner can sample Markov chains sequentially to observe their states. The goal of the learner is to sequentially select various chains to learn transition matrices uniformly well with respect to some loss function. We introduce a notion of loss that naturally extends the squared loss for learning distributions to the case of Markov chains, and further characterize the notion of being \emph{uniformly good} in all problem instances.
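The sampling problem described above can be sketched with a simple count-based estimator and a greedy allocation rule. This is a hypothetical sketch under strong simplifications: the `uncertainty` proxy below (driven by the least-visited row) merely stands in for the paper's squared-loss-based index, and the greedy rule is one natural way to pursue uniformly good estimation across chains:

```python
class ChainEstimator:
    """Count-based estimator of one ergodic chain's transition matrix,
    fed by a single stream of observed states."""

    def __init__(self, num_states):
        self.num_states = num_states
        self.counts = [[0] * num_states for _ in range(num_states)]
        self.state = 0  # current state of the observation stream

    def observe(self, next_state):
        """Record one observed transition and advance the stream."""
        self.counts[self.state][next_state] += 1
        self.state = next_state

    def estimate(self):
        """Row-normalized counts (uniform rows where nothing is observed yet)."""
        P = []
        for row in self.counts:
            n = sum(row)
            P.append([c / n if n else 1.0 / self.num_states for c in row])
        return P

    def uncertainty(self):
        """Crude uncertainty proxy: decreases as the least-visited state
        accumulates outgoing transitions."""
        return 1.0 / (1.0 + min(sum(row) for row in self.counts))

def allocate(estimators):
    """Greedy adaptive allocation: sample the chain whose transition-matrix
    estimate is currently most uncertain, so that all chains are learned
    uniformly well."""
    return max(range(len(estimators)), key=lambda i: estimators[i].uncertainty())
```

The point of the greedy rule is that a chain that mixes slowly, or rarely visits some state, keeps a high index and therefore keeps attracting samples until its worst-estimated row catches up with the others.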
Optimizing KV Cache Eviction in LLMs: Adaptive Allocation for Enhanced Budget Utilization
Feng, Yuan, Lv, Junlin, Cao, Yukun, Xie, Xike, Zhou, S. Kevin
Large Language Models have excelled in various fields but encounter efficiency limitations due to the extensive KV cache required for long-sequence inference. Many efforts evict non-critical cache elements during runtime, thereby reducing cache size within a given memory budget while preserving generation quality. Our reexamination of their underlying principles discerns that prevailing strategies essentially aim to minimize an upper bound on eviction loss within a specific budget allocation. However, we observe that the current practice of uniformly allocating budgets across different attention heads during the eviction procedure tends to degrade the quality of generation post-eviction. In light of these findings, we propose a simple yet effective adaptive allocation algorithm that not only theoretically ensures its loss upper bound does not exceed that of previous uniform allocation methods, but also effectively aligns with the characteristics of the self-attention mechanism, thus practically reducing the upper bound. Further, integrating this algorithm with two of the most advanced methods yields Ada-SnapKV and Ada-Pyramid. Extensive experimental validation across 16 datasets and the Needle-in-a-Haystack test confirms that Ada-SnapKV and Ada-Pyramid achieve further enhancements, establishing new benchmarks in state-of-the-art performance.
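The core idea of budget allocation per attention head can be illustrated with a small sketch. This is a hypothetical simplification, not the paper's algorithm: `head_scores` is assumed to be some per-head importance mass (e.g. summed attention weights over candidate cache entries), and the total budget is split in proportion to it rather than uniformly:

```python
def adaptive_head_budgets(head_scores, total_budget):
    """Split a total KV-cache budget across attention heads in proportion to
    each head's aggregate importance score, instead of giving every head
    total_budget / num_heads entries as uniform allocation does."""
    num_heads = len(head_scores)
    total = sum(head_scores)
    if total == 0:
        # no signal: fall back to uniform allocation
        base = total_budget // num_heads
        budgets = [base] * num_heads
    else:
        budgets = [int(total_budget * s / total) for s in head_scores]
    # hand rounding leftovers to the highest-scoring heads, one entry each
    leftover = total_budget - sum(budgets)
    order = sorted(range(num_heads), key=lambda h: head_scores[h], reverse=True)
    for h in order[:leftover]:
        budgets[h] += 1
    return budgets
```

The intuition matching the abstract: heads whose attention mass is concentrated on a few entries need only a small budget to cover almost all of their loss, so shifting the freed entries to heads with flatter attention lowers the aggregate eviction-loss bound relative to a uniform split.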
Learning Multiple Markov Chains via Adaptive Allocation
Talebi, Mohammad Sadegh, Maillard, Odalric-Ambrym